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The recent fabrication of graphene nanoribbon (GNR) field-effect transistors poses a cliallenge 
for first-principles modeling of carbon nanoelectronics due to many thousand atoms present in the 
device. The state of the art quantum transport algorithms, based on the nonequilibrium Green func- 
tion formalism combined with the density functional theory (NEGF-DFT) , were originally developed 
to calculate self-consistent electron density in equilibrium and at finite bias voltage (as a prereq- 
uisite to obtain conductance or current-voltage characteristics, respectively) for small molecules 
attached to metallic electrodes where only a few hundred atoms are typically simulated. Here we 
introduce combination of two numerically efficient algorithms which make it possible to extend the 
NEGF-DFT framework to device simulations involving large number of atoms. Our first algorithm 
offers an alternative to the usual evaluation of the equilibrium part of electron density via numerical 
contour integration of the retarded Green function in the upper complex half-plane. It is based on 
the replacement of the Fermi function f[E) with an analytic function f{E) coinciding with f{E) 
inside the integration range along the real axis, but decaying exponentially in the upper complex 
half-plane. Although f{E) has infinite number of poles, whose positions and residues are deter- 
mined analytically, only a finite number of those poles have non-negligible residues. We also discuss 
how this algorithm can be extended to compute the nonequilibrium contribution to electron density, 
thereby evading cumbersome real-axis integration (within the bias voltage window) of NEGFs which 
is very difficult to converge for systems with large number of atoms while maintaining current con- 
servation. Our second algorithm combines the recursive formulas with the geometrical partitioning 
of an arbitrary multi-terminal device into non-uniform segments in order to reduce the computa- 
tional complexity of the retarded Green function computation by evaluating only its submatrices 
required for electron density or transmission function. We illustrate fusion of these two algorithms 
into the NEGF-DFT-type code by computing charge transfer, charge redistribution and conduc- 
tance in zigzag-GNR| variable- width-armchair-GNR|zigzag-GNR two-terminal device covered with 
a gate electrode made of graphene layer as well. The total number of carbon and edge-passivating 
hydrogen atoms within the simulated central region of this device is ~ 7000. Our self-consistent 
modeling of the gate voltage effect suggests that rather large gate voltage ~ 3 eV might be required 
to shift the band gap of the proposed AGNR interconnect and switch the transport from insulating 
into the regime of a single open conducting channel. 

PACS numbers: 73.63.-b, 71.15.-m, 85.35.-p, 81.05. Uw 



I. INTRODUCTION 

The recent discovery of graphene^ — a single layer of 
graphite representing first truly two-dimensional crys- 
ta P — has opened new avenues for carbon nanoelectron- 
ics.'^ The limits on continued scaling of present silicon- 
based electronics are set by the fundamental physical ef- 
fectd^ (such as quantum tunneling of carriers through the 
gate insulator and through the body-to-drain junction; 
dependence of the subthreshold behavior on temperature; 
and discrete doping effects), the most detrimental being 
power dissipated in various leakage mechanisms. This 
is especially dangerous for minimal field-effect transistor 
(FET) dimensions and oxide thicknesses. Following the 
discovery of carbon nanotubes (CNTs), which are rolled 
up sheets of graphene, the exploration of carbon nano- 
electronics over the past decade as a strong contender to 
aging silicon technology has been centered around semi- 
conducting CNTs as the new type of channel for FET 
that also makes possible unconventional transistor de- 
signs.!^ 

Single-wall CNTs bring their unique features into na- 



noelectronics arena, such as ballistic transport or diffu- 
sion with very long mean free paths, high mobility at 
room temperature due to suppressed electron-acoustic- 
phonon scattering, current carrying capacities of the or- 
der of 10^ A/cm^, and one of the largest known specific 
stiffness.'^ However, full integration of CNTs into com- 
plex high-performance nanoelectronic devices has been 
thwarted by several unresolved issues, such as: (i) elec- 
tronic inhomogeneity where random mixture of semicon- 
ducting and metallic CNT (due to uncontrolled distribu- 
tion of diameters and chirality in current synthesis meth- 
ods) degrade device performance; (ii) difficulty in align- 
ing and patterning through standard lithography meth- 
ods suitable for high- volume production because of CNTs 
not being fiat; and (iii) extreme sensitivity to minute 
changes in their local chemical environment.^ 

Graphene shares many of the features of CNT, offer- 
ing large critical current densitie^ and intrinsic mobility 
limit ~ 2x 10^ cm^/Vs at room temperature being higher 
than any of the known inorganic semiconductors.^ Such 
high mobility promises near-ballistic transport and ultra- 
fast switching. Thus, from its inception,'^ application of 
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graphene inFET devices has been a major experimental 
endeavor 

Ho weve r, all graphene-FETs fabricated with wide 
sheet^^ES! have poor ratio of on-state current /on to 
off-state current loS due to the bulk graphene samples 
behaving as a zero gap semiconductor. Nevertheless, 
recent breakthrough fabrication (via chemical deriva- 
tion,E] STM tip drawing^^ and CNT unroUingi^ of 
sub- 10-nm- wide graphene nanoribbons (GNRs), all of 
which are semiconducting, has led to the development 
of GNRFETfP with /on/4ff ratio up to 10*^ which is 
suitable for logic devices. 

Moreover, unusual band structure of graphene has gen- 
erated a plethora of proposals to create devices that have 
no analog in silicon-based electronics. The new function- 
ality brought by the GNR electronic structure, such 
as "valley valves"!^ or difference in transmission proper- 
ties of reflectionless 120° and highly reflective 60° turns 
made of GNRs with zigzag edges, can only be captured 
by quantum transport analysis. At the same time, equi- 
librium interatomic c harge tr ansfer and chemical dop- 
ing by different atom^i^lMlI] or atomic group^^l that 
passivate GNR edges require to model explicitly atom- 
istic structure and corresponding charge density within 
the device. These task s are be yond the scope of popu- 
lar tight-binding model^i^HIMI (projected onto the basis 
of single p^-orbital per carbon atom), or even simpler 
continuous Weyl Hamiltonian describing massless Dirac 
fermions as low-energy quasiparticles close to the charge 
neutrality point .■^ Furthermore, in the nonequilibrium 
state driven by the finite bias voltage one has to com- 
pute self-consistently charge redistribution and the cor- 
responding electric potential in orderjxi keep the gauge 
invariance^^ of the /-V^characteristic^^ intact. 

Finally, virtually every experiment on graphene em- 
ployees gate electrodes to move the Fermi level away from 
the charge neutrality point or shift conduction from elec- 



tron to hole carriers, so that self-consiste nt comp utation 
of the inhomogeneous charge distributiorP^'SIESl induced 
by the gate voltage and its highly n on-trivial effects on 
the band structure of GNR^^zMIlo] jg necessary to un- 
derstand device performance (rather than using unreal- 
istic constant shift of the on-site potential to simulate 
the presence of the gate electrode in the tight-binding 
models^^). 

Thus, the prime candidate capable of handling 
all of these issues within a unified quantum trans- 
port framework'^^ "^^ is the nonequilibrium Green func- 
tion (NEGF) formalism'^'^ combined with the density 
functional theory (DFT) in standard approximation 
scheme&34 ^^^^^i as LDA, GGA, or B3LYP) for its 
exchange-correlation potent ial. The sophisticated al- 
gorithm ci35.36-37,38 , 39,40.4i.42,43l4ll j3yglQpgj implement 

the NEGF-DFT framework over the past decade can be 
encapsulated by the iterative self-consistent loopP^ 

n'"(r) ^ DFT ^ HKs[n(r)] ^ NEGF ^ n°"'(r)- (1) 

The loop starts from the initial input electron den- 
sity n'"(r) =J> employs some standard DFT codtP 
(typically in the basis set of finite-range orbitals 
for the valence electrons which allows for faster 
numerics and unambiguous partitioning of the sys- 
tem into "central region" and the semi-infinite 
ideal leads) to get the single particle Kohn-Sham 
Hamiltonian HKs["-(r)] -h'^\7^/2m + V^ir) 

[^^{r) = VH{r) + V^c{r) + Vc^t{r) is the DFT mean- 
field potential due to other electrons with Vtr(r) being 
the Hartree and T4c(r) the exchange-correlation contri- 
bution; Vcxtij) is the external potential] =^ inversion 
of HKs[«(r)] yields the retarded Green function G^{E) 
whose integration over energy determines the density 
matrix via NEGF-based formula: 
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P=-\ j dEln,[G'{E)]}{E^^in)-\ J d/? G'^(/?) • Im [Si(/;)] • ^^(i?) [/(/?- a*l) -/ (^^ - Mi?)] = Pcq + Pneq- 

— oo — oo 



The matrix elements n(r) = (r|p|r) are the new elec- 
tron density as the starting point of the next iteration. 
This procedure is repeated until the convergence criterion 
llpout _ pin 1 1 <; J ig reached, where (5 ^ 1 is a tolerance 
parameter. 

The representation of the retarded Green function in 
the local orbital basis requires to compute the inverse 
matrix 

G'^(/?) = [/?-HKs[n(r)]-S(/;)]-i. (3) 
The advanced Green function matrix is defined as 



G^iE) = [C^ (£;)]■!■. The non-Hermitian matrix 
S(£;) = Sl(£;) -f ^r{E) is the sum of the retarded self- 
energy matrices introduced by the "interaction" with 
the left [El{E)] and the right [Sb{E)] leads. These 
self-energies determine escape rates of electrons from 
the central region into the semi-infinite ideal leads, 
so that an open quantum system can be viewed as 
being described by the (non-Hermitian) Hamiltonian 
Hopen = HKs[ri(r)] + S(/;). 

The NEGF post-processing of the converged result of 
DFT calculations makes it possible to obtain the current 



3 



through a two-terminal device in terms of the Landauer- 
type formulaP^ 

+00 

HVds)^^ I dET{E,Vds)[f{E~^iL)-f{E^^lR)]. 



This integrates the self-consistent transmission function 
T{E, Vds) = Tr {Tr{E, Vds)G's,iT l{E, Vds)Gls] , (5) 

for electrons injected at energy E to propagate from the 
left to the right electrode under the source-drain applied 
bias voltage hl— fJ-R = sVds- Here Gg ^ is the submatrix 

of C whose elements (5|G"'|1) connect orbitals in the 
first lead supercell (layer denoted as 1) of the extended 
central region "sample -I- portion of the electrodes" to the 
last lead supercell (layer denoted as S) of the simulated 
region. 

The matrices TlME) - i['Sl,r{E) - S[ ,^(S)] = 
— 2ImSi_fl(i?) account for the level broadening due to 
the coupling to the leads.l^ A usual assumption about 
the leads is that the effect of the bias voltage can be 
taken into account by a rigid shift of their electronic 
structure, so that Y;L,R{E,Vds) = T;l^r{E T eVds/2,0) 
and TLM{E,Vds) = Tl,r{E T eVds/2,0) are computed 
in equilibrium and then the shift ±eVds/2 is applied to 
their electronic structure to mimic the applied bias. The 
energy window for the integral in Eq. Q is defined by 
the difference of Fermi functions f{E — ^l) — f{E — ^r) 
of macroscopic reservoirs into which semi-infinite ideal 
leads terminate. The formula ^ is valid only for 
coherent transport, i.e., assuming absence of dephas- 
in^^ due electron-phonon or electron-electron interac- 
tions (beyond those captured by the mean-field treat- 
ment"** ^s'j^ 

Thus, the most demanding computational task of 
the NEGF-DFT framework is the self-consistent eval- 
uation of the density matrix p whose different algo- 
rithmic steps have the foUowing"^^ computational com- 
plexit'^^ in terms of the number of atoms N:^^ (i) the 
computation n'"(r) V°^{r) of the effective potential 
for HKs['^(r)] has complexity O(A^logA^); {ii) the sec- 
ond step, y^(r) HKs['^(r)], has complexity 0{N); 
(in) usual computation of all elements of the retarded 
Green function, HKs[«(r)] — > C, requires 0{N^) op- 
erations; {iv) C p scales as 0{N); and (w) the fi- 
nal step p n°"*(r) also has complexity 0{N). Ob- 
viously, the bottleneck is set by the retarded Green 
funct ion computation. S ince NEGF-DFT computational 
code j35 | 36 | 37 | 38 | 39 | 40 | 4i | 43 | a,re developed and tested for 

small molecules attached to metallic electrodes (where 
they are successful when coupling between the molecule 
and the electrodes is strong enough to diminish Coulomb 
blockade effect^^Sl) ^ they typically evaluate all elements of 
C" by inverting through Eq. ^ the Hamiltonian of the 
extended molecule region. Because this has to be done 
repeatedly through self-consistent loop (IT]), the number 



of atoms in the extended central region "molecule -I- por- 
tion of the electrodes" that can be simulated is limited 
to few hundreds. This bottleneck also prevents realistic 
modeling of single or multipl^^ gate electrodes — instead 
of an additional layer of atoms covering portion of the 
central region, one typically employs a uniform electric 
field in the direction perpendicular to the transport .I^SEi] 

A more subtle reason for the failure of conventionally 
implemented NEGF-DFT codes when applied to systems 
containing large number of atoms is the integration in the 
second term Pnoq in Eq- (|2| which must be performed 
along the real axis since the integrand is not analytic 
anywhere in the complex plan. Although this integra- 
tion is restricted by the Fermi functions to a segment of 
the order of the applied bias voltage, a very fine integra- 
tion grid must be used to capture locations of subband 
edges (introduced by semi-infinite leads) and broadened 
molecule orbitals where sharp peaks in the integrand oc- 
cur. This problem is exacerbated in devices contain- 
ing large number of atoms where the increasing number 
of such sharp peaks — due to van Hove singularities in 
the density of states of the leads or quasi-bound states 
present when different contacts throughout the device are 
not perfectly transparent — can make it virtually impos- 
sible to converge Pncq- 

The present approach in NEGF-DFT algorithms to 
deal with this issue is to move the line of integration 
slightly into the complex plane. However, this effectively 
adds small imaginary part ir] to the Hamiltonian Hopcn 
which, therefore, does not conserve current. For exam- 
ple, direct application of this procedure to experimen- 
tal graphene devices, such as 100 nm long GNRFET of 
Ref. il5j would lead to substantial difference between the 
total current in the left and the right leads. This issue 
is rarely discussed in the usual NEGF-DFT treatment of 
transport through relatively short molecules where such 
violation of current conservation is small. 

Some recent attempts to solve it, such as locating 
the peaks due to quasibound states a nd pa tching the 
non-equilibrium density matrix integraljI^^El] cannot be 
applied to large systems with many such peaks. The 
peaks can be broadened by physi cal dephasing mecha- 
nisms due to electron-electroiP^'^ or electron-phonon in- 
teractions,'^but this drastically changes the NEGF-DFT 
approach by requiring additional and computationally 
very expensive self-consistent loops to calculate extra 
self-energy functionals'^^ due to interactions within 
the device for which the sparsity of the Hamiltonian ma- 
trix Hopcn becomes irrelevant. 

Recent efforts''^ ^^^^ to replace some of the 

algorithms within the NEGF part of the NEGF-DFT 
scheme, such as unfavorable computational complexity 
of bru te for ce matrix inversion^"' or the real-axis inte- 
gratioiJ^^Mi j,^ p^^q, have still not led to self-consistent 
electron density and transport calculations for systems 
composed of more than about a thousand of atoms 
Here we introduce modified NEGF-DFT scheme which 
is based on our novel algorithm for the integrations in 
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Eq. ([2]) combined with the partitioning the nanostructure 
of arbitrary shape into sHces containing much smaller 
number of atoms. The Green function matrices of these 
slices, needed to obtain the electron density within the 
slice, are computed recursively with much more favor- 
able computational complexity than 0{N'^). The num- 
ber of iteration steps within the self-consistent loop is 
further reduced, in the case of nanode vices in equilib- 
rium or in quasi-equilibrium situations (e.g., due to by 
non-zero gate voltage and zero or linear response bias 
voltage), via modified Broyden mixing scheme for input 
and output charge density. We demonstrate the capa- 
bility of our computational code, termed CANNES (car- 
bon nanoelectronics simulator), to treat multi-terminal 
structures containing large number of atoms by comput- 
ing the self-consistent electron density and conductance 
in the presence of the gate voltage in a graphene nanode- 
vice whose extended central region is composed of ~ 7000 
carbon and hydrogen atoms. 



The paper is organized as follows. Sec. |TT] elaborates 
on the "pole summation" algorithm for computing in- 
tegrals in p. In Sec. |III| we demonstrate efficiency of 
our approach by setting up a three-terminal FET-type 
device whose source and drain electrodes are made of 
zigzag graphene nanoribbon (ZGNR) source and drain 
electrodes while its channel is an armchair GNR (AGNR) 
of variable width and with sizable energy gap. The third 
electrode is gate modeled as a rectangularly-shaped layer 
of carbon atoms covering the FET channel. The dangling 
bonds of all graphene layers are terminated by hydrogen 
atoms. The DFT part of the calculation is carried out 
using the self-consistent environment-dependent tight- 
binding model (SC-EDTB) with four orbitals per carbon 
atom and one orbital per hydrogen atom, which is specif- 
ically tailored to simulate eigenvalue spectra, electron 
densities and Coulomb p otenti al distributions for carbon- 
hydrogen nanostructures.'^^'^ The combination of "pole 
summation" algorithm with the recursive Green function 
formulas allows us to compute in Sec. |III| intricate electric 
potential distribution in the space around ZGNR-AGNR- 
ZGNR FET device, as well as to demonstrate how much 
voltage has to be applied on the gate electrode to push 
the device from the off-state due to the gap of AGNR into 
an on-state enabled by a single transport channel crossing 
the Fermi level. The computed source-drain conductance 
as a function of the gate voltage also demonstrates that 
even at zero gate voltage there is a difference between 
the non-self-consistent and self-consistent conductance, 
where the latter takes into account charge transfer be- 
tween different atomic species or different segments of 
the device. We conclude in Sec. IIVI 



II. SELF-CONSISTENT ALGORITHMS FOR 
ELECTRON DENSITY 

We rewrite the equilibrium contribution to the density 
matrix ([2]): 

-f-00 

Peq(/^,T)--^ J dElm[G''{E)]f{fi,T,E), (6) 

in the form which emphasizes its dependence on the 
chemical potential jj. and temperature T, as well as that 
the lower limit of integration is the lowest energy at which 
Im \ G^(Ejn in)] 7^ 0. As long as the end-point i?min is se- 
lectecpfiBH below the bottom of the valence band edge, 
there are no further poles in the integrand, and thus the 
expression is exact. Although this looks obvious, it is 
important to point out that if the value l-Eminl is too 
small, and there are poles left outside of the contour, 
the corresponding poles will not be included in the in- 
tegration. This causes charge to erroneously disappear 
from the system, which typically initiates an avalanche 
effect, pushing the poles even further out, and even more 
charge is lost, until the system is totally void of electrons. 
When this occurs, the calculation will actually converge 
trivially, but to a physically incorrect solution. 

Since diagonal matrix elements of G^{E) are a rapidly 
varying function of energy, a direct integration along 
the real axis would be rather ineffective since its nu- 
merical accuracy is not sufficient to achieve convergence 
of the self-consistent electron de nsity. I nstead, present 
NEGF-DFT computational coded^^^El deform the in- 
tegration contour into the upper complex half-plane 
Im [E] > 0, where the retarded Green function is much 
smoother. This is allowed since G'^{E) is analytic in the 
upper complex half-plane (all of its poles are slightly dis- 
placed below the real axis). 

The thick white line in Fig. [T] designates typically cho- 
ggjj 35 | 36 | 3 9i43i integration contour. It consists of a semi- 
circular part SC and a horizontal line L parallel to the 
real axis on the right which is positioned to enclose spe- 
cific number A^poics of the Fermi function poles z^"-* while 
ensuring that SC and L are sufficiently far away from 
the real axis so that the Green function is smooth over 
both of these two segments [the main variation of the in- 
tegrand on L comes from the Fermi function f{E) which, 
the refore , can be used as a weight function in the quadra- 
turtpfiESj. The final expression for poq obtained in this 
procedure (using the Cauchy residue theorem for the 
closed contour SC + L + vertical segment from L to 
the real axis + portion of the real axis) is: 
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I dEG'^{E)f{ti,T,E) 
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FIG. 1: (Color online) The density plot of the absolute value of f{E) in the upper complex half-plane. Lighter color denotes 
greater value of | / |. Solid black corresponds to zero, while gray color inside the dotted rectangle represents unity. White 
dots denote the poles with their size being roughly proportional to the absolute value of the residue. Poles running along AB, 
BC, and CD edges of the rectangle correspond to z'"-*, a nd , respectively. Thick white curve denotes the integration 
contour traditionally used in NEGF-DFT computational codes.l^^l^^ Top insets are 3D plots of Re [/] and Im [/] in the upper 
complex half-plane. 



where the smoothness of G^(£') on SC + L contour is 
exploited to perform the approximate integration in the 
first t erm b y using a quadrature with a small number of 
points.iafiMsl 

Obviously, it would be advantageous to compute in- 
tegral in Eq. (|6| precisely and without worrying about 
proper selection of parameters for positioning SC and 
L, via a simple summation over a finite set of complex 
energies akin to the second term of Eq. (|7|. Here we 
introduce such an algorithm which makes possible virtu- 
ally exact evaluation of poq by "pole summation." This 
algorithm is discussed separately for high temperatures 
(and/or valence electrons) in Sec. II A and low tempera- 



tures (and/or core electrons) in Sec. II B 



A. High temperature and/or valence electrons 

The algorithm for equilibrium density matrix compu- 
tation discussed in this Section can be used when the 
inequality 



is satisfied. If Eq. ([S]) is not satisfied, a slightly more 
elaborate algorithm described in the next Sec. |IIB| is 
needed. Let us define the desired precision through the 
non-negative number p, such that the magnitude of the 
relative error is (5 < e~P. In most cases the machine pre- 
cision roughly corresponds to p = 30, while the practical 
range of p is usually between 21 and 27. 
We start by introducing a function / 

/(/I, fine, Mim, T, Trc, Tim, E) = /(i/ilm, jli,„, E) X 
(/(Ai,T,i?)-/(^Re,-TRe,i?)) , (9) 

where all its arguments except E are limited to real do- 
main and satisfy the following inequalities (fcs is the 
Boltzmann constant and = — 1): 



TRe>0, Tin, >0, 
MRc ^ E^in — pkBTRc, 
Aim ^ pknTim- 



(10a) 
(10b) 
(10c) 



(A*-£^min)/fcsr<103, 



The choice of parameters given by Eq. ( 10 1 guarantees 



(8) 



that for real E > E^in the function / deviates from / by 



6 



no more than S. Therefore the replacement of / with / 
in the integrand of Eq. (|6| will result in the relative error 
less than i5. In the following we assume that p > 21 so 
that S < 10-9. 

Thus, for all practical purposes we can state that (all 
arguments except E are omitted for brevity) 



Poq = --Im 
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dEG-{E)f{E) 



(11) 



The poles and residues of the first term in the product 
on the right-hand side of Eq. (l9| are given by 



^im = «Wm + ■nkBfi^{2n + 1), 



Res f{i(lini,ifi^,z) 



(12a) 



-ikBTi^. (12b) 



where n is an integer. Similarly, the poles and residues 
of /(/i, T, E) in the second term are 



z^") = ^i + 'KikBT{2n + l), 



(13a) 

Res [/(m,T,z)],^^<„, --fcsT, (13b) 
and for /(/irc, -Trc, E) they are 



^r"o^ = Aro + nikBT^cC^n + 1), 



Res 



/(^Rc,-Tro,2) 



knT] 



(14a) 
(14b) 



Inequalities (10 1 provide sufficient freedom to prevent 



the coincidence of the poles z*-^-*, and z^"' (V j, to, 

and n). Thus, / only has first order poles with residues 
given by: 



Res f{z) = -ikfi^ x 

(/(Ai, T, 4n ) - /(Arc, -Trc, 4m) 



(15a) 



Res 
Res 



/(^) 



= -fc_BT/(i/iim,?T'i,„,z^"-') , (15b) 
= -kBTf{iflim,ifi^,z^j['^) . (15c) 



In the upper complex half-plane the residues ( 15al decay 



exponentially if Ile(S^) lies outside the interval [/iRo, /^], 



and the residues ( |15b ), (15c I decay exponentially if the 
imaginary component of the poles 2^"^ or z^-* exceeds 
jlim- Thus, for any given p only the limited number of 
poles {Zj}, j g {1, A'poic} have non-negligible residues. 



If one replaces the real axis integration in Eq. (11) 



by the integration along the semi-circular contour of the 
sufficiently large radius in the upper complex half-plane, 
the contour contribution to the integral is zero, and the 
contribution from the poles is solely from {Zj}. The 
integral (111 is computed as the sum over all non-zero 



residues: 



Pcq = Im 

TT 



Afpolc 

2TTi Res \f{z) 



G^iZ,) 



(16) 



where the set {Zj} is comprised of only those {^i^ }' 
{z'")}, and {zj^"*} poles which satisfy 

I fi^i, T, 42!) - /(Arc, -Tr,c, 4m) I > e"^ (17a) 
I /(*Aim,«Ti„„z(")) I >e-^ (17b) 

(17c) 



I f{ip,im,ifim,z'^c) I > e ^ 



respectively, in order to keep the relative error below e~^. 

For values of i?niin and T obeying the inequality ([s]) 
and 21 < p < 30 the number of relevant poles A^poic is 
moderate. For example, it is safe to chose i?min = —27 eV 
for valence electrons in a hydro-carbon system (note that 
this value for i?niin is measured from the vacuum level). 
Then, at room temperature the ratio ^ is around 700, 
and for p — 21 the minimal number of required poles for 
parameters satisfying Eq. (10) equals 76. Decreasing p 



down to machine precision raises the minimal number of 
poles to 96. 

Figure [l] shows the density plot of / corresponding 
to p = 21 and -Emin = — 27 eV used to compute self- 
consistent electron within the graphene nanodevice ex- 



ample of Sec. Ill The minimal number of poles A'poio is 



obtained as follows. We consider Tim and Tro as free pa- 
rameters, and the minimum allowed /2r,o and /iim are ob- 



tained from equalities in constraints imposed by Eq. ( 10 1. 
Then, the number of poles z*-") is approximately twice the 
value of /2im divided by the inter-pole distance 



Nai 



2/^1111 
2TrkBT' 



(18) 



and the approximate numbers of poles along the lines CB 
and DC in Fig. [T|are 

NcB = ^ . ^ , (19a) 



2TrkBTij 



DC 



2jl 



Im 



27rfcBTRe 



(19b) 



respectively. The optimal values of Tim and Tro are ob- 
tained by minimizing A'poie = Nab + Nqd + Njjc in the 
space of these two parameters. A small Tro and /iim ad- 
justment, subject to constraints ( |10[ ), is made afterwards 
to place the line CD right in between the two poles on 
lines AB and DC (cf. Figs, [l] and [2]). This is done to 
ensure that the poles are not too close to each other, 
otherwise a large numerical errors may occur. 



B. Low temperature and/or full core simulations 

The minimum number of poles A^poic is scaled by the 
temperature and the energy interval — /xrc. In order to 



7 



(a) 



Number of Poles: 185 



D2 



(b) 



Number of Poles: 121 



(c) 



Number of Poles: 117 



-25 



-20 



-15 
Re[E\ (eV) 



-10 



-5 



FIG. 2: (Color online) The poles with non-zero residues for the same system shown in Fig.[T]but at fcsT = 0.003 eV: (a) poles 
of /; (b) poles of F'^-*; and (c) poles of F'"^-*. Circular zoomed-out regions depict dense pole arrangement at energies close to 
the chemical potential /i. 



reduce A'poio, it is desirable to have as large spacing be- Eq. (20 1. The poles running along the line D1D2 are the 



tween the poles as possible. According to Eq. (|10c 



increasing Tim for the given p means the increase of flim- 
The increase of /lim in turn increases the length of the 
segment AB, and hence the number of poles z*-"-* to be 
summed. On the other hand, reducing the number of 
z*^"^ (i.e., decreasing \AB\—iliya), will bring the line CD 
closer to the real axis, so to prevent deviation of / from 
unity on the real axis requires to decrease Tim . The latter 

(n) 

increases the number of poles along the line CD. 

The simple solution to this problem is to break the 
interval between ji^e and ^ into several sub-intervals. 



and apply the scheme presented in Sec. II A to each sub- 
interval. For example, if the original interval is split into 
two sub-intervals, the substitution for / is 



/(a*: MRci , Mlmi , T, Troi , Tlinj , E) + 
/(Arc 1 ; MR02 5 Alm2 7 2^Roi I ?Rc2 5 21m2 l : 



(20) 



and 



Trci 



where T < Trci < Trc^ ; /iRc2 < A*Rci < 
Aimi</iim2- The parameters /iRci,2i Wmi,2' .... 
and Timi 2 ensure the required precision by satisfying 
the constraints similar to Eq. (10 1: 



sS -E'min -pkBTRc2, (21a) 

Figure |2]^b) illustrates these concepts. Poles forming the 
left (smaller) and the right (bigger) rectangles are asso- 
ciated respectively with the first and the second term in 



same for the first and second term in Eq. ( 20 ) 



The minimization of the total number of poles iVpoie 
is performed analogously to Eqs. (18 1 and (19). For F'^'^^ 



the optimization parameters are Trci , Jro2 , Tir 



and 



Tinii • The starting point for the conjugate gradient min- 
imization is Trci = 10 X T and /iim2 = 10 x /iimj , so that 
the optimized parameters fit this order of magnitude re- 
lationship. Indeed, the size of the integration intervals in 
Fig. |2jb) and [2f^c) increases by an order of magnitude 
from right to left. For this reason A^poie grows logarith- 
mically with increasing ratio (/i — i?inin)/^s2^- That is, 
depending on p, approximately 30 to 40 extra poles are 
required for each decade of this ratio increase (i.e., per 
order of magnitude in temperature reduction) . 



Approximate real axis integration of 
non-analytic functions 



The concepts presented in Sec. |II A| allow for efficient 
and exact evaluation of the G^{E) moments in the inter- 
val bounded by two Fermi functions. This property can 
be used for systematic approximation of G"'{E) with the 
function G^Ie) such that G°'{E) « G"(£:) on the real 
axis, and which is analytic in the upper complex half- 
plane. This approximation can be used to transform the 
non-analytic integrands to analytic functions. 

An obvious applications of this idea to NEGF-DFT 
framework would be the computation of nonequilibrium 
contribution /9„eq to the density matrix in Eq. ([2|. Be- 
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cause the functions G'^{E) and G°'{E) in the integrand 
of Pnoq are non-analytic below and above the real axis, 
respectively, the integrand is non-analytic function in the 
entire complex energy plane. Thus, no integration con- 
tour deformation akin to Fig. [l] can be exploited to avoid 
direct integration along the real axis to obtain Pncq- On 
the other hand, such direct integration along the real axis 
is computational ly exp ensive due to the need for very fine 
integration grids .ISSISS discussed in Sec. |lj integration 
may not even converge when the integrand becomes too 
spiky with numerous closely spaced sharp peaks for de- 
vices containing large number of atoms. 

Let us divide the interval [fiR, (Xl] into AI subintervals 
of equal size A/i 



(22) 



where we assume for simplicity that A/x — iksT. Then 
Pncq in Eq- ([2]) can be rewritten as 

Pncq = j dEG-{E) ■ Im [-EiE)] ■ G'^iE) x 



[fi^l^,T,E)-fi^l,r^-l,T,E)]. 



(23) 



For each interval [/im-i, /i,„] in the sum (23) we ap- 
proximate G^{E) by the power expansion with re- 
spect to the deviation from the center of the interval 

Cm = (Miti-I + /^rn)/2 



K 



■i=0 



where gm' are constant matrices. We require that the 
moments M^^ up to order D for G5„ and coincide 

+00 

mt^= J dEG^{E){E-Uf^ 

— 00 

[f{^im,T,E)~ f{^i,r.^^,T,E)] = 
K 

/ dEY,^\^} x{E~U) 

■J n 



[/(Ai,„,r,i?)-/(Ai,„_i,T,ii;)], 



(25) 



where d C [0,-0]. 

The first integral in Eq. ( 25 1 can be computed accu- 
rately as 

-t-00 

j dEG^{E){E-UYx 

— OQ 

[f{fi^,T,E)^f{fi^_i,T,E)]^ 

+00 

J dEG^{E){E-UY^ 

— OQ 

T,T,Ti^,E). 



lm[E] 




FIG. 



/in 



(Color online) Poles 

-l,/ilm,r, r, flnijZ) used to 



of the function 
evaluate the inte- 
gral in Eq. (261. For a chosen precision set by p = 23, the 



contribution from 28 poles has to be summed. Three poles 
of f{^J.m.,T,z) are marked with the red empty circles. The 
values of the retarded Green function at most of t he p oles 



shown are reused to compute matrices M^''' in Eq. (|25l) for 



n 7^ m, so that the average number of Green functions to be 
computed per interval equals 3. 



Figure [3] shows the poles of / from Eq. ( 26 ) for the case 
p — 23, /2im = STrksT, and Tim — T/n. Even though the 
number of poles to be summed per every moment equals 
28, the number of points per integration interval A/i at 
which G''(i?) needs to be calculated is 3 because the val- 
ues of G^{E) at different poles are reused in computation 

of the moments at different intervals. Thus, Mm"* is com- 



that G''{Zj 



puted similarly to Eq. ( |16| , with the only difference being 



is now replaced by G^{Zj){Zj 



Because matrices gm^ do not depend on energy, the 



integrals in the second term of Eq. ( 25 ) 



+00 



T« = y" dE{E-Urx 

— 00 

[f{^lm,T,E)~f{^i^^,,T,E)], (27) 



can be computed analytically. Here we provide example 
solution of this problem for D = 2 (the solutions for 
D > 2 are similar to this). The integrals are non-zero 
when K is even integer. For example, assuming A/it — 
2k bT they are 



To = 2kBT, T2 = -{keTf (l + n^) , 

T4= ^(fci3T)^(3+107r2 + 77r^). (28) 
15 



(26) Then, to satisfy Eq. (25 1 for d — 0,1,2, matrices g 
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should be chosen as 



(0) _ 




- M}^'Ti 


m 


^ 2 


T0T4 


(1) - 


ivim 




m 


T2 ' 




(2) _ 




- Af^?)T2 


m 


-f 2 
^ 2 


-T0T4 



(29a) 
(29b) 
(29c) 



The analytic continuation of G°'{E) into the upper com- 
plex half-plane is simply 



2 

E 

K=0 



[g(:)]^x(z-e„o" 



(30) 



Then Eq. (23 1 becomes 



1 

Pncq = ^ (i^m ^ ^In) i 



where 



dECiE) ■ T,{E) ■ G'^iE) X 
[fi^l^,T,E)- fi^l„,-l,T,E)]. 



(31) 



+ 00 



(32) 



The integrand in Eq. ( |32[ ) is now analytic in the upper- 
half plane and can be evaluated through our "pole sum- 



mation" algorithm discussed in Sections II A and |IIB| 

The algorithm presented in this Section is actually 
more computationally expensive than the usually imple- 
mented'^^ real axis integration to get Pnoq since for 
every interval one needs to compute the retarded Green 
function at three different points instead of one, as shown 
in Fig. |3] Nonetheless, the benefit of this approach 
is in systematic approximation by exact match of the 
Green function moments which can evade insufficiently 
fine i ntegrat ion grid or, most importantly, uncontrolled 
^gg^g^36|42|43j q£ ^Yie real-axis infinitesimal Hopen + it] that 
leads to serious current non-conservation in long devices 
beyond molecular electronics scale. For example, a very 
large system poorly coupled to its contacts may have sev- 
eral sharp peaks within 10 meV inte rval. None of the 
adaptive real-axis integration methodd^^'^ can properly 
account for these peaks if the integration step equals 10 
meV, while the moments-matching algorithm has capa- 
bility to capture the contribution from these peaks to the 
integral. 



III. EXAMPLE: FIRST-PRINCIPLES 
MODELING OF TOP-GATED GNR-BASED 
NANOELECTRONIC DEVICES 

From the very outset, the discovery of graphene 
has been intimately connected to attempts to fabricate 



carbon-based planar FETs.^^ Since FETs produced us- 
ing micron-size graphene sheets as channels have poor 
lon/IoS ^ 10 ratio, the pursuit of FETs suitable for dig- 
ital electronics applications has shifted toward fabrica- 
tion of GNRs with large band gaptP ~ 0.4 eV. Their 
band gap can be engineered by transverse quantum con- 
finement effects in the case of AGNR (where the gap 
is additionally affected by the increased hopping inte- 
gral between the p^-orbitals on carbon atoms around 
the armchair edge caused by slight changes in atomic 
bonding length in the presence of edge passivating hy- 
drogen^ or by staggered sublattice potential arising 

due to n on-zero spin polarization around zigzag edges 
of ZGNR.'2£lsnisni6i|62l 

The very recent experimentJ^^^^^ElElIil] have demon- 
strated that all sub- 10-nm- wide GNRs are semiconduct- 
ing. Since band gaps due to edge magnetic ordering in 
ZGNR are easily destroyed at room temperature,'^ by 
finite current under nonequilibrium bias voltage condi- 
tions,'2Slor by impurities and vacancies along the edge,!^ 
we assume that AGNRs are essential ingredient to intro- 
duce sizable band gap in graphene nanodevices operating 
at room temperature, as confirmed also by recent tunnel- 
ing spectroscopy.!^ 

The fabricated GNRFETs thus far have utilized metal- 
lic source and drain electrodes where Schottky barrier 
(SB) is introduced at the contact between metallic elec- 
trode (typically Pd with high work function) and GNR, 
so that the current is modulated by carrier tunneling 
probability through SB at contacts. On the other hand, 
planar structure of graphene is envisaged to make possi- 
ble all-graphene electronic circuits patterned from either 
a single graphene plane or multiple planes separated by 
layers of insulating material.^ISi 

Any all-graphene circuit concept will require both ac- 
tive FETs and passive elements for wiring individual cir- 
cuit elements. Although ZGNR can be expected to be 
metallic at room temperature, the wiring based on them 
is nontrivial issue because only few specific ZGNR pat- 
terns have close to ideal conductance and can transmit 
electron flux without losses. Furthermore, at finite bias 
voltage ZGNRs can open a band gap if they are mirror 
symmetric with respect to the midplane between the two 
zigzag edges.ESl 



A. Three-terminal device setup 

Our FET-type device setup, based on the combination 
of ZGNR source and drain metallic electrodes and semi- 
conducting AGNR channel in between them, is shown 
m Fig. g] The source and drain have different widths and 
are modeled as semi-infinite ideal ZGNRs leads. The size 
of the AGNR band gap is an oscillating function of the 
ribbon width. The width variation causing AGNR to 
switch between small and large gaps equals to just a sin- 
gle C-C bond length, which was found to greatly affect 
the transfer characteristics (current / vs. gate voltage 
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FIG. 4: (Color online) Graphical depiction of the atomic 
structure of simulated nanodevice composed of two narrow 
graphene layers. The lower graphene layer contains two 
unidirectional ZGNRs of different width, which act as the 
source and drain metallic electrodes, sandwiching semicon- 
ducting AGNR of variable width as the FET channel. The 
top graphene layer plays the role of a gate electrode, cover- 
ing all of the AGNR channel region, and has the shape of a 
rectangle that is sufficiently large to have negligible band gap. 
The interlayer distance is 3.35 A, which corresponds to the in- 
terlayer spacing in graphite.^'* The hydrogen atoms (red dots) 
passivate edges of both layers, whose internal carbon atoms 
(blue) form defect free finite-size honeycomb lattice. Dark 
and light colored transverse segments, which have variable 
shape as one moves from the source to the drain electrode, 
are used to mark odd and even slices of the partitioned sys- 
tem. Each slice i = 1,. . . ,S is described by the Hamiltonian 
matrix H^^i, all of which are stored in computer memory to- 
gether with matrices Hi,i+i describing the coupling between 
adjacent slices i and i -I- 1. 



Vgs at fixed source-drain bias V^s) in the recent studj^^ 
of several FET concepts with AGNR channel. Because 
cutting graphene with atomic precision in order to obtain 
uniform device performance is currently not an option, 
the variable width AGNR seems to be the simplest realis- 
tic path toward making a short semiconducting fragment. 
Above the semiconducting "active region" we place a 
graphene rectangle, which is assumed to have no elec- 
trical contact with the ZGNR-AGNR-ZGNR structure 
below it. This may be achieved by placing boron-nitride 
insulating layer in between. 

We note that the recent analysi^^U (using NEGF for 
simple p^-orbital tight-binding model, which is self- 
consistently coupled to a three-dimensional Poisson 
solver for treating the electrostatics) of dual-gate Schot- 
tky barrier GNRFETs, with uniform width AGNR chan- 
nels and several different types of graphene- or non- 
graphene-based source and drain electrodes, has singled 
out ZGNR-AGNR-ZGNR device concept as an optimal 
one with high enough lon/IoS ratio and advantageous 



features of ZGNR metallic contacts. 

The usage of wide graphene sheets as the channel of 
FET is conceptually diSicult because depending on the 
position of the Fermi level graphene possesses either elec- 
tron or hole conductivity making it impossible to produce 
regions depleted of mobile charge carriers. At the same 
time, the concept of GNR devices allows to build both 
normally- OFF and norm ally-O N transistors based solely 
on the device geometries) ^'' * ^^ ! One of the main benefits 
of graphene in nanoelectronics is its one-atom-thickness 
which leads to very low parasitic capacitance, and there- 
fore allows terahertz cut-off frequencies for all-graphene 
devices and circuits. So far bot h the experiments^ and 
quantum transport simulationd^Ull] i^q^q been focused 
on GNRFETs whose channel is long and narrow semi- 
conducting GNR attached to metallic source and drain 
(such as Pd) contacts while being controlled by metal- 
lic top-gate shifting the band gap. Although such tran- 
sistors play an important role in studying GNR prop- 
erties, they compromise the main purpose of nanoelec- 
tronic devices — the speed. The p arasitic gate-substrate 
or gate-source (drain) capacitance^^ZHHSolfor such hybrid 
metal-graphene structure are orders of magnitude higher 
than capacitance of the channel, and thus substantially 
decrease the transistor speed. Exploring all-graphene na- 
noelectronic devices to reach the optimal speed limit is 
one of the primary motivations for the design concept 
shown in Fig. [4] 

B. System partitioning and the recursive Green 
function algorithm 

The retarded Green function matrix G'^{E), as the cen- 
tral NEGF quantity in phase-coherent transport regime 
which yields electron density through Eq. ^ and current 
via Eq. (|4]), can be computed by direct matrix inver- 
sion in Eq. However, the computational complexity 
0{N'^) of this operation makes it virtually impossible for 
present NEGF-DFT codes (which typically perform this 
brute force operation) to be applied to systems contain- 
ing large number of atoms.'^Thus, first-principles simu- 
lation of transport in large systems can be accomplished 
only if relevant elements of G^{E) can be obtained via 
algorithms that scale linearly with increasing length of as- 
sumed quasi-one-dimensional (QID) device geometry.^^ 

In fact, since only a much smaller submatrix of G'^{E) 
determines transport properties given by Eq. (|4|, the re- 
cursive Green function algorithms®*^ (in serial or parallel 
implementation^®) have commonly been used to compute 
the submatrix ^ and obtain the transmission proper- 
ties of mesoscopic devices.'^ They are based on using 
the Dyson equation, GJ^ = Gq + GgVG^, to build the 
Green- function slice by slice, so that the dimensions of 
the matrices that have to be inverted are strongly re- 
duced (Gq is the Green function of some region of the 
device with one of the leads attached, V is the hopping 
matrix between that region and adjacent slice, and Gq is 
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the Green function of the coupled system lead + region 
+ slice). 

Thi s type of algorithms have also been ex- 
tendecPHSZElMIZni to obtain other submatrices of C 
needed to compute local quantities within the simulated 
region, such as G[ j or G[j^]^ which define the electron 
density within slice i or spatial profile of local currents 
between slices i and i + 1, respectively. Although it is 
often considered^ that standard or extended recursive 
Green function algorithms can be applied only to QID 
two-terminal devices, some alternative approaches which 
invert smaller matrices than the full device Hamilto- 



nian Hopen to build the Green function of multi-terminal 
nanostructures of arbitr ary ge ometrical shape have also 
been introduced recently.lSSEoi 

The key issue for a successful inclusion of the recur- 
sive Green function formulas into NEGF-DFT codes is 
not the specific set of equations, which is very similar in 
different approaches, but the ability to make a consistent 
partition of a system of arbitrary shape and with many 
attached electrodes into slices described by much smaller 
matrices i. The full Hamiltonian matrix can then be 
written as 



H 



KS 



1,1 



H 



1,2 



Hlr. 



Hi . 







H 



\ 





s-i,s 



(33) 



since due to the finite range of basis functions in the 
transport direction the size of the slices can always be 
chosen so large that only neighboring ones are coupled 
through each other via the hopping matrices Hi.i+i. 

An example of the solution to this primarily geometri- 
cal problem is illustrated using the device setup in Fig. |4] 
Our algorithm here starts from the bitmap image of the 
device — > converts the image into a finite-size honeycomb 
lattice — > then attempts to partition the device within a 
loop until consistent set of slices is achieved across the 
whole device. The final result — a set of slices of non- 
uniform shape (in contrast to typical columns of sites 
orthogonal to the axis of the device when recursive algo- 
rithm is applied to two-terminal QID devices of simple 
shap#2lMI) — is shown in Fig. [4] as dark and light colored 
segments of the honeycomb lattice. 

Each slice is described by a matrix Hi^i containing 
the interactions between atoms within the layer i (i — 
1,. . . ,5"). The size of the matrix H^^i is Ni x Ni, where 
Ni is the total number of atomic orbitals for all atoms in 
the slice i. These matrices are much smaller than H, and 



are stored in memory at the beginning of the calculation 
together with matrices H^^i+i. 

Starting from the set of matrices H^^i and H^^i+i, we 
implement the simplest recursive Green function algo- 
rithm aimed at getting G^j from which we can compute 
the density matrix pi of slice i by replacing G*" in Eq. ^ 
with G[ j. The retarded Green function G[ ^ of each slices 
is given by: 

GUE) = [M,., - H,,, - lll\E) - S^(i?)]-i. (34) 

where T,^£{E) and Y,^^{E) are the self-energies due to the 
rest of the device on the left and on the right, respectively, 
attached to slice i (I^^i is the unit matrix of the same size 
as Hj^i). 

The self-energies S^'(i5) generated by the left side of 
the device attached to slide i are computed through the 
recursive formula which starts from the self-energy of the 
left semi-infinite ideal electrode, Y1l{E — cUl) = Hq ■ 
g'j^{E — cUl) ■ Ho.i, and proceeds through 
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H 



2,3 



[£^12 



Ho 



H 



2,3, 



(35a) 
(35b) 



H 



s-i,s 



[EI 



S-1,S-1 



H 



s-i.s- 



S-2,S-2 



(i?)]-i • Hs_i^ 



(35c) 
(35d) 



Here g£(£') is portion of the retarded Green function 
of the isolated lead connecting atoms in the edge prin- 
cipal layer that is coupled to the extended central re- 
gion via Hq 1. The same recursion starts from the right 
semi-infinite ideal electrode to generate the self-energies 
I]^*(i?), where the self-energy of the right semi-infinite 
ideal electrode, 1:r{E - bUr) = Us^s+i ■ SniE - cUr) ■ 
H^g_i_-^, and the Hamiltonian iis,s of the first slice S 
on the right side of the extended central region are used 
to construct the starting equation of the recursion anal- 
ogous to Eq. ( |35a[ ). 

We note here that the usual simplification in 
NEGF-DFT codes is to treat the extended central region 
out of equilibrium while electronic structure of the ideal 
semi-infinite leads is computed in equilibrium, thereby 
ignoring the self-consistent response of the leads to the 
current. Although it has been pointed outP^that this ap- 
proximation can be incompatible with asymptotic charge 
neutrality, this is rarely taken into account. Instead 
of assuming that the equilibrium band structure of the 
leads is rigidly shifted by the bias voltage ap- 
plied between the macroscopic reservoirs to which they 
are attached, we use T^Ul^r satisfying eVds/2 > bUl > 
s-Ur > —eVds/2 as the shifts of the lead on-site ener- 
gies, "SlAE, Vds) = ^l,r{E T sUlm, 0). Here the po- 
tential eUL,R is adjusted after each iteration within the 
self-consistency loop if the total charge on slices 1 and S 
(obtained from Tr pi and Tr ps respectively) is found to 
deviate from the neutral state charge. 

After the self-consistency is reached, the transmission 
T{E, Vds) in Eq. Q is computed from the submatrix < i \ 

G^^ obtained recursively via the Dyson equation by '^\c^ds{E)=if{E-iiL)TL{E) + if{E-iiR)TR{E). (37) 

Then the Keldysh equation 

G<(£;) = G^{E) ■ [S<,,,(£;) + S<,(i?)] . G%E)^ (38) 

allows to eliminate as independent NEGF and ex- 
press the corresponding density matrix 



in Sec. [l] — numerous sharp peaks in the integrand of pncq 
that render real axis integration non-convergent — can 
be solved in principle by including the interaction^SEH 
within the simulated region capable of washing out the 
quantum interference effects (that are, anyhow, seldom 
observed in devices at room temperature). For exam- 
ple, the inclusion of electron-electron correlation effects 
within the GW approximation was demonstratecP^ to 
broaden or remove sharp features in the NEGFs for test 
systems (such as a chain of gold atoms). 

In the presence of such dephasingprocesses, one has 
to resort to the full NEGF formalisrrPSi whose core quan- 
tities are the retarded G'' and the lesser G^ Green 
function describing the density of available quantum- 
mechanical states and how electrons occupy those quan- 
tum states, respectively. Both Green functions can be 
obtained from the contour-ordered Green function de- 
fined for any two time values that lie along the Kadanoff- 
Baym-Keldysh time contour.!^ In addition to the re- 
tarded Sicads and the lesser Sj'^g^j^, self-energy due to 
attached electrodes, the full formalism requires to com- 
pute self-energy functionals due to many-body interac- 
tions within the sample, Sjnt and SJ^^, while using con- 
serving approximatioiJ33 for their expression in terms of 
G'- and G<. 

In the phase-coherent transport regime, Sint — and 
Sj'^j^ — 0, so that the lesser self-energy of non-interacting 
(i.e., mean-field or Kohn-Sham) quasiparticles can be ex- 
pressed solely in terms of the retarded self-energies of the 
leads 



starting from the known retarded Green function G\i 
(34 1 of the first slice on the left: 



^Ii — [Eli,! — Hi, 



S^(£;)]-i.Ht,,,.GU,i. (36) 



Thus, the computational complexity of the retarded 
Green function evaluation is reduced from 0{N^) for the 

~ 3 ~ 3 

full matrix inversion to 3Ni{S + Ni S operations, 
where Ni is the average number of atoms within the slice 
i. This means that the time required to obtain all rele- 
vant submatrices G[ ^ and G^ ^ for the NEGF-DFT al- 
gorithm scales linearly 0{S) with increasing the length 
of the device (i.e., the number of slices S). 

The recursive Green function algorithm helps to re- 
solve only one of the two key problems in the application 
of NEGF-DFT to large devices. The other one discussed 



(39) 



using only G''(i?) and Tl\cads{E), as shown explicitly by 
Eq. (If. 

On the other hand, even the simplest phe- 
nomenological NEGF models of dephasing, such as 
"momentum-conserving" choice H^ntiE) = dG^{E) and 
^hit(-^) ~ dG'^{E) {d measures the dephasing strength) 
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proposed in Ref. HSJ require to solve Eq. (|3| and Eq. ( 38 ) 

as a system of coupled matrix equations involving full 
size matrices in the Hilbert space of the simulated de- 
vice region. For example, in the case of the dephasing 
model of Ref. 46 , this means iterative solving of Eq. ^ , 
with G5= [E - H - "El^^^iE)]-^ as the initial guess, 



and then using converged C to solve Eq. (38 1 as the 



Sylvester equation of matrix algebra. Obviously, in this 
case the sparse nature of H-matrix in Eq. ( 33 1 and the 



corresponding recursive Green function formulas become 
irrelevant for reducing the time it takes to obtain all 
relevant NEGFs in a single step of the self-consistent 
loop 0. 

More realistic description of interactions with the ex- 
tended central region is far more computationally de- 
manding.'*'* '*^ Thus, the only route toward first-principles 
modeling of transport through large devices is to remain 
within the phase-coherent transport regime and develop 
algorithms that can resolve problems in the convergence 
of integration in pneq along the real axis, as discussed in 



Sec. ITTClor by Refs. 1521551 



C. Quasi-Non-Equilibrium Model 

The DFT part of our simulation, which constructs the 
Hamiltonian of the central region as an input for NEGF 
post-processing to obtain the device transport proper- 
ties, is performed by using the SC-EDTB model.I^^MI 
This model accounts for atomic polarization and inter- 
atomic charge transfer in a standard DFT-likc fashion 
while making it possible to use a minimal basis set of four 
Gaussian orbitals per carbon and one orbital per hydro- 
gen atom. The usage of such minimal basis set allows us 
to reduce the size of matrices H; and H^.i+i discussed 
in Sec. |IIIB] without loosing any of the important aspects 
of ab initio input about carbon-hydrogen systems. This 
makes SC-EDTB highly advantageous when treating sys- 
tems with large number of atoms. 

Conceptually, SC-EDTB can be viewed as the pseudo- 
potential DFT scheme with each atom having its own 
atomic orbital basis set adjustable to the local atomic 
environment around this atom. It is a hybrid of the 
non-self-consistent environment-dependent tight-binding 
modei^and a Gaussian-based DFT scheme. Such adap- 
tive behavior adequately compensates for the low pre- 
cision of the minimal orthogonal basis set. In prac- 
tice, SC-EDTB implements the environment dependence 
as the parametrization of Hamiltonian matrix elements 
with respect to the atomic environment, rather than the 
parametrization of the atomic basis set. For example, 
the parameterized part of Hamiltonian matrix elements 
for the atom near the edge of the nanoribbon will be dif- 
ferent from the respective matrix elements in the middle 
of the strip. Similarly, the in-plane Hamiltonian matrix 
elements for a single graphene layer will be different from 
the respective matrix elements in a graphene bilayer. 

The SC-EDTB Hamiltonian matrix elements are the 



sums of parameterized adaptive "TB-like" and non- 
adaptive "true DFT" contributions. The former mainly 
accounts for the covalent bonding, while the latter de- 
scribes interatomic charge transfer, atomic dipole po- 
larization, and on-site variation of exchange potential. 
The extensive comparison of SC-EDTB with large ba- 
sis set DFT calculations indicates that SC-EDTB pro- 
duces more precise and transferable results than minimal 
basis set pseudo-potential DFT schemes. At the same 
time, SC-EDTB is faster than minimal basis set pseudo- 
potential DFT due to: (i) faster computation of matrix 
elements; (ii) unit overlap matrix (i.e. orthogonal basis 
set); and (iii) smaller number of components used for the 
description of electron density (SC-EDTB uses ten inde- 
pendent components, s^, sp^, spy, sp^, pi, pi, pi, p^Py, 
PxPzi PyPz, to describe the electron density at a given 
carbon atom). This allows us not only to capture the 
interatomic charge transfer, but also to account for the 
dipole polarization. 

The compact description of electron density makes 
possible efficient combination of SC-EDTB with conver- 
gence acceleration schemes for both equilibrium and non- 
equilibrium cases, as discussed in Sec. [Xj The more de- 
tailed specification of electron density provided by stan- 
dard DFT codes in local density (or some other) ap- 
proximation'^ will decrease the computation efficiency, 
but will not affect the simulation of graphene devices 
whose operation is based on charge transfer at the scale 
larger than carbon-carbon bond length. To accom- 
modate systems composed of tens of thousands atoms, 
the SC-EDTB part of our NEGF-DFT computational 
code also includes the possibility of multipole expan- 
sion of Coulomb potential and parallelization on dis- 
tributed/shared memory systems. 

Despite 5 A cutoff radius for the orbitals used in 
SC-EDTB, the coupling Hamiltonian matrix elements 
between the top and the bottom graphene layers of the 
system depicted in Fig. [4] have to be masked with ze- 
ros to simulate insulating layer in between. This causes 
the nonequilibrium density matrix ^ in the presence of 
the gate voltage Vgs to evolve into two equilibrium inte- 
grals ([6| 

p..a.(//,y^„T) = -- / dElm[G^{E)]fifi,T,E) 

ncq TT J 

— OQ 

+ CXD 

- ^ 1 dElm [Gl,,,iE)] {fi^, + eVy,,T, E) 

— oo 

-f{^l,T,E)}, (40) 
each of which is evaluated through our "pole summa- 



tion" algorithm encoded by the formula (|16|^ Here Ggg^^^ 



refers to the Green function matrix EqT~(3| computed 
for the whole device, but whose all elements associated 
with atoms in the lower source-channel-drain layer are 
masked with zeros. That is, only those matrix elements 
which correspond to the gate layer are allowed to be non- 
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zero. We assume that the self-consistency of the recur- 
sive Green function algorithm -|- Broyden mixing scheme 
(see Appendix [a]) is reached when ||n°"* — n'"|| < 10^^, 
where the elements of the electron density vector n are 
extracted from the diagonal blocks of the cor respon ding 
PquLi and Pquisi matrices [as discussed in Sec. 



IIIB 



only 

their diagonal blocks are computed from recursively gen- 
erated submatrices G[j(i?) of the retarded Green func- 
tion] . 



D. Results and Discussion 

We first assume zero gate voltage and plot in Fig. [s] 
the self-consistent Hartree potentiaP^l computed via the 
Poisson equation with net charge density due to charg- 
ing of carbon atoms as the source term. The potential 
profiles are evaluated within the planes that are parallel 
to two graphene layers in Fig. [4] and positioned in the 
region between them. The inhomogeneous profiles are 
caused by charge transfer between hydrogen and carbon 
atoms. Furthermore, it is important to emphasize that 
there is approximately 100 meV difference between the 
Fermi levels of the wide fj-^ido and narrow /inarrow source 
and drain ZGNR electrodes, respectively, in the bottom 
graphene layer of the device in Fig. [4] This is caused by 
different ratios of carbon atoms to hydrogen atoms passi- 
vating the zigzag edges in GNRs of different widths. That 
is, the edge hydrogen atoms effectively dope the nanorib- 
jjQjj i9 | 20| 2ij -^j-^ej-g ii^Q level of doping depends on its size 
and geometry. To account for this, the equilibrium Fermi 
level of the whole setup /i = (/iwido + /^narrow )/2 used in 



Eq. ( 40 1 is assumed to be the average of iimdc and narrow 



MnaiTow Such Compensation of the difference in the Fermi 
levels requires a small built-in electric field in our model. 
Room-temperature (T = 300 K) operation is assumed in 
all Figures in this Section. 

Then we apply voltage eVgs = 1 cV to the gate elec- 
trode in Fig. |6] and plot the full three-dimensional spatial 
profile of the electric potential. Further increase of the 
gate voltage to eVgs = 3 eV leads to potential (within a 
geometrical plane in between two graphene layers) shown 
in Fig. [T] The self-consistent atomistic level simulation 
captures the potential variation in the transverse direc- 
tion of the GNRs, as well as possible modifications of 
the band s tructure of GNRs with increasing gate volt- 

127129130) 

In both Figures, we find that the chosen portion of 
metallic ZGNR electrodes attached to th e AGNR chan- 
nel to form the "extended central region" p6|39 | 43 | g^com- 
passing ~ 7000 carbon and hydrogen atoms for self- 
consistent electron density and potential calculations, is 
actually not large enough (despite many ZGNR super- 
cells included into the extended central region) to com- 
pletely screen the effect of the applied electric field via the 
top gate electrode. This is signified by the color of the 
Coulomb potential at the boundaries (marked by hor- 
izontal white lines in Fig. [7| of the "extended central 



region" not being identical to the color of the uniform 
potential along the semi-infinite leads. The total uncom- 
pensated charge at the boundary is approximately 0.03 e 
for eVgs = 1 eV and 0.07 e for eVgs — 3 eV. 



Another feature conspicuous in Fig. |7] is that the on- 
site potential shift experienced by carbon atoms in the 
lower layer is much smaller than expected from the ap- 
plied bias voltage. This unusual screening capability 
of insulating AGNR channel can be attributed to the 
presence of short segments of metallic AGNR due to ei- 
ther particular width of such segments (we do not relax 
the coordinates and edge bonds which is necessary to 
make all three types of AGNR insulating^^") or doping by 
evanescent modes^-^l that decay from ZGNR electrodes 
into AGNR channel thereby generating metal induced 
gap statetP (localized at the ZGNR|AGNR interface). 1231 
This is also refiected in the conductance of our device — to 
shift the band gap of variable- width AGNR by 0.5 eV and 
bring it into single channel conducting regime demands a 
rater large gate voltage eVgs ~ 3 eV (when compared to 
eVgs — half-the-band-gap required to turn uniform semi- 
conducting AGNR into a single channel conductoil^ , as 
shown by the source-drain conductance computed as the 
function of Vgs in Figs. [8Fb)-(d). 



The metallic behavior of ZGNR electrodes is charac- 
terized by the non-zero density of states and finite (zero 
temperature) conductance at the Fermi level Ep. We 
note that in simple nearest-neighbor tight-binding mod- 
eld^ the conductance of infinite ZGNR around the charge 
neutral (Dirac) point Ep = is quantized G = Gq 
[Gq — 2e'^/h is the conductance quantum for spin- 
degenerate transport) due to a single open conducting 
channel (i.e., transverse propagating mode) defined by 
the overlap of edge-localized wave functions .l^liSI Qn the 
other hand, in DFT description (that can be mimicked by 
single pz-orbital tight-binding models which include third 
nearest-neighbor hoppin^^ more complicated subband 
structure of ZGNR leads to three open conducting chan- 
nels'^ around Ep — and G — 3Gq quantized conduc- 
tance for semi-infinite source and drain ZGNR electrodes. 
This is confirmed in the context of our NEGF-DFT ap- 
proach by Fig. Ma.) . 



Comparing Fig. pla) with Fig. 



which are both 



obtained at Vgs = V, highlights the importance of 
self-consistent electron density computation, even in the 
absence of gate voltage effects. We find a marked dif- 
ference in two panels between the position of the gap 
region [over which the transmission function T{E,0) in 
Eq. ([5]) is zero] and conductance oscillations outside of it. 
The conductance in Fig. [8](a) was obtained without com- 
puting charge transfer effects, and could be reproduced 
by popular non-self-consistent tight-binding modelJ^^^^ 
without resorting to full NEGF-DFT formalism. 
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FIG. 5; (Color online) Contour plot of the Hartree potential for zero applied gate voltage {Vga = V) in the planes which 
are 0.7 A (left panel) and 0.5 x 3.35 A (right panel) above the lower graphene layer of the system depicted in Fig. [Z] White 
horizontal lines in the ZGNR electrode regions mark the boundaries of the extended central region "AGNR channel + portion 
of ZGNR electrodes" composed of ~ 7000 atoms (for which the retarded Green function is evaluated to obtain electron density 
and electric potential through the self-consistent loop). 



IV. CONCLUDING REMARKS 

The modeling of realistic multi-terminal graphene na- 
noelectronic devices requires quantum transport methods 
that can capture effects of its highly unusual electronic 
propertie s'^ and their dependence on detailed device 
geometry P^ I ^^ I as well as charge transfer (in equilibrium) 
and charge redistribution (out of equilibrium) effects on 
atomistic scale. While quantum transport approaches 
based on simple pre-defined Hamiltonian&i^J cannot han- 
dle all of these issues, the NEGF-DFT framework, which 
generates the self-consistent Hamiltonian of the device 
prior to the calculation of conductance or /- characteris- 
tics, offers a proper methodology for first-principles mod- 



eling of electron transport involving accurate quantum- 
chemical description of atomic scale geometry. 

However, NEGF-DFT simulations thus far have been 
limitecpl'to rather small systems, such as short molecules 
connected to metallic electrodes. Here we address sev- 
eral obvious'^^ and more subtle (Sec. [l| impediments 
that have to be resolved to make possible the applica- 
tion of NEGF-DFT codes to devices containing many 
thousand atoms: (?) computational complexity of the 
retarded Green function calculation, as the main time 
limiting part of the simulation when full Hamiltonian 
matrix is inverted, should scale linearly with the system 
size; (ii) integration of NEGFs to get the equilibrium 
and nonequilibrium part of the density matrix has to be 
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FIG. 6; (Color online) Contour plot of the Hartree potential 
in the plane 0.2 A above the lower graphene layer when the 
applied gate voltage is eVgs = 1 eV. The semiconducting re- 
gion is shifted by approximately 0.35 eV. The potential spikes 
pointing downwards correspond to the hydrogen atoms. Posi- 
tive potential spikes associated with carbon atoms in the C-H 
dipole pairs are truncated to make a clear view of the poten- 
tial inside the conducting channel. Note that the potential 
axis points downwards. 



performed in a way (especially in the case of nonequi- 
librium contribution) which ensures convergence despite 
sharp peaks (due to assumed phase-coherent transport of 
non-interacting quasiparticles) along the real axis whose 
number increases substantially in large systems; and (in) 
the convergence of the self-consistent loop, which repeat- 
edly evaluates (i) and (ii), should be accelerated with 
proper mixing scheme of previous iterative steps that is 
compatible with solution of problems in ( i) and ( ii) . 

The algorithms presented here extend the NEGF-DFT 
methodology to systems containing large number of 
atoms through a combination of: 



(1) The "pole summation" algorithm for the ex- 
act integration of the retarded Green function in 
the expression for the equilibrium part of the den- 
sity matrix offers an alternative to standard nu- 
merical contour integration by replacing the Fermi 
function f{E) with the analytic function f{E), 
which coincides with f{E) inside the integration 
range along the real axis but decays exponentially 
in the upper complex half-plane. Only a finite 
number iVpoie of its poles, which can be found 
analytically, has non-negligible residues, so that 
Peq = Im QfjG''(Zj) where aj are scalars 

given by simple analytical expressions in Eq. (16 1. 
The typical value of iVpoio for valence electrons at 
room temperature is 80, and it increases with the 
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FIG. 7; (Color online) Contour plot of the Hartree poten- 
tial for the applied gate voltage eVgs = 3 eV in the plane 
0.7 A above the lower graphene layer. White horizontal lines 
around the ZGNR electrodes mark the boundaries of the ex- 
tended central region "AGNR channel -I- portion of ZGNR 
electrodes" composed of ~ 7000 atoms. 



ture reduction. 

(2) Possible application of the "pole summation" 
algorithm to tackle the problem of difficult-to- 
converge integration of NEGFs along the real-axis 
(due to numerous sharp peaks in the integrand 
which woul d be impossible to locate and handle 
individuallj'^^Illl for devices contains large number 
of atoms) to obtain Pnoq after its non-analytic inte- 
grand in the entire complex plane is approximated 
with an analytic function in the upper complex 
plane, so that the same type of summation can be 
performed as in the case of poq integral. 

(3) The recursive Green function formulas which. 
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FIG. 8: (Color online) The non-self-consistent (a) and self- 
consistent (b)-(d) source-drain conductance (at linear re- 
sponse bias voltage Vds) of the nanodevice depicted in Fig. |4] 
as a function of energy. The conductances are obtained in the 
absence (a), (b) or presence (c), (d) of the gate voltage Vgs, 
where charge redistribution is computed self-consistently in 
all three cases (b)-(d) [unlike in (a)]. The solid and dashed 
rectangular lines in panel (a) show the conductance quanti- 
zation of the infinite source (wide nanoribbon, solid line) and 
drain (narrow nanoribbon, dashed line) ZGNR electrodes, re- 
spectively. The Fermi level in the case of unbiased gate cor- 
responds to E = 0. 



assuming proper geometrical decomposition of the 
lattice of the device into slices of irregular shape for 
arbitrary nanostructure geometry, makes it possi- 
ble to reduce scaling of the required computing time 
from 0{N'^) for the full Hamiltonian matrix inver- 
sion in the single iteration of the self-consistent loop 
to linear scaling 0{S) [S is the number of slices in 
the transport direction] of the computation of only 
the diagonal blocks of the retarded Green function 
that yield the electron density within the slice. 

In the case of equilibrium or quasi-equilibrium (such 
as generated by non-zero gate voltage and zero or linear 
response bias voltage) situations, we additionally acceler- 
ate convergence of the self-consistent loop for the density 
matrix by using the modified Broyden scheme discussed 
in Appendix |X] which is compatible with the recursive 
Green function algorithm and mixes input and output 



electron density from all previous iterations to generate 
input density for the next iteration step. 

We illustrate the numerical efficiency of the com- 
bination of these algorithms for NEGF part of the 
calculation by integrating it with the DFT code 
(based on the minimal basis set — four localized or- 
bitals per carbon atom and one per hydrogen — tailored 
for carbon-hydrogen systems) to simulate gate voltage 
effects in all-graphene FET-type device. Our simu- 
lated ZGNR|variable-width-AGNR|ZGNR device is com- 
posed of ~ 7000 atoms and employs AGNR of variable 
width (kept below 10 nm) as a realistic semiconductor 
cha nnel acces sible to present nanofabrication technol- 
0gy| ii | i2 | i3 | i4 | rpj^g device does not require atomic preci- 
sion in controlling the width and the corresponding band 
gap when uniform sub-lO-nm wide AGNR are used, while 
exploiting advantageous ZGNR source and drain elec- 
trodes. We also use square-shaped gate electrode cov- 
ering the channel which is made of graphene as well. 
The self-consistent evaluation of the electron density and 
Coulomb potential is required to capture inhomogeneous 
charge distribution and modification o f the GN R band 
structure with increasing gate volt age .'^^'^Mni This re- 
veals that rather large gate voltage is required to shift 
the band gap of variable-width AGNR channel and bring 
this type of top-gated GNRFET into a window of single 
open transverse propagating mode with low scattering 
and heat dissipation. 

The computation of self-consistent electron density 
and electrostatic potential, as the crucial aspect of 
NEGF-DFT approach to quantum transport modeling, is 
indispensable to properly take into account gate voltage 
effects or to ensure the gauge invariancc^^ of the /- Vchai- 
acteristics in far from equilibrium transport .^^Sl In addi- 
tion, we also demonstrate notable difference between the 
zero-bias transmission (i.e., linear response conductance) 
of non-self-consistent and self-consistent modeling. This 
can be attributed to charge transfer effects between edge 
passivating hydrogen atoms and carbon atoms, where 
such edge doping also affects the position of the Fermi 
level of isolated GNRs of different size and geometry. 
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APPENDIX A: BROYDEN MIXING SCHEME 
FOR CONVERGENCE ACCELERATION OF THE 
SELF-CONSISTENT LOOP 

The recursive Green function algorithm discussed in 
Sec. |III Bj drastically reduces the computational complex- 
ity of a single iteration step within the self-consistent 
loop ([T|). Another important ingredient of algorithms 
that can handle systems with large number of atoms is 
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to combine the recursive techniques with the convergence 
acceleration scheme based on proper mixing of quantities 
found in previous steps to produce the input for the next 
step. 

The simplest mixing scheme takes certain fraction e of 
the output electron density n™* from the previous step 
m and the remaining fraction (1 — e) from the corre- 
sponding input nj" to produce input for the next step, 
^m+i — ~ ^)^m + SJ^m'- Finding the optimal value 
for the mixing parameter, typically e ~ 0.1 — 0.01, de- 
pends on the nature of the system (such as, insulating vs. 
metallic or isolated vs. attached to semi-infinite leads). 
This can require few thousand iteration steps to satisfy 
the convergence criterion ||n™* — n™|| < 10^^ we employ 
in our simulation. 

The more sophisticated mixing schemes employ Pu- 
lajEi or BroyderP^I^SllZl algorithms to mix several previ- 
ous steps, where the quantities mixed can be the density 
matrix or Hamiltonian and Green functions'*'* (which can 
be more efficient for open multi-terminal systems where 
the central region does not have a fixed number of elec- 
trons). For a small bias voltage, the self-consistency can 
be achieved by applying the Broyden convergence accel- 
eration method which has two major adva ntages. First, 
the modified second Broyden methocP^I^ is compatible 
with the recursive Green function method discussed in 



Sec. IIIB Second, the Broyden method adds 0{N) ex- 
tra operations, so that the single iteration is not slowed 
down. However, the reduction of the number of iterations 
achieved by the Broyden method is appreciable. 

The Broyden method works well when the correla- 
tion between the electron density and the potential is 
local, i.e., when the local potential distortion results 
in a local self-consistent density change. On the other 
hand, in the case of non-local correlations the Broyden 
method performance rapidly deteriorates. The nonequi- 
librium electron density in the coherent ballistic approx- 



imation constitutes the perfect example when the Broy- 
den method fails. The reason for this is that electron- 
potential correlations becomes completely non-local — the 
change of the potential at one contact can shut off the 
electron flux through the entire system and cause the 
system-wide electron density redistribution. Thus, in far- 
from -equil ibrium cases other mixing schemes have to be 
used.™! 

In parti cular, the modified second Broyden 
methocpSEZI jg compatible with the recursive Green 
function method discussed in Sec. |IIIB| and makes it 
possible to reduce the number of iteration steps to the 
order of ~ 10. In this scheme, an input electron density 
for iteration m + 1 is constructed from the set of input 
and output densities generated in all previous iterations: 



'■m+l 

F 

w,. 



nZ - eF™ - 5] W, . [*,f . F„, (Ala) 

J =2 

nr-njS, (Alb) 
-e(F, - F,_i) + nj" - r^t-i 

i-l 

-5^W,.[*,f .(F,-F,_i), (Ale) 



i=2 



(F, - F,_i)^ 



(F,-F,_i)^.(F,-F,_i)- 



(Aid) 



F„j, Wj, and $j comprise a relatively 
small set of vectors to be stored in computer memory. 
The compatibility of this modified Broyden scheme with 
the recursive Green function algorithm of Sec. |IIIB] stems 
from the fact that only diagonal blocks of C", required to 
construct vectors in Eq. (Al), arc computed recursively 



without knowing the full Green function needed in some 
other mixing schemes ) ^^ ' ^'* ' 
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